16 research outputs found

    Data Mining in Internet of Things Systems: A Literature Review

    The Internet of Things (IoT) and cloud technologies have been the main focus of recent research, allowing for the accumulation of a vast amount of data generated from this diverse environment. These data undoubtedly contain priceless knowledge if it can be correctly discovered and correlated in an efficient manner. Data mining algorithms can be applied to the IoT to extract hidden information from the massive amounts of data that IoT devices generate and that are thought to have high business value. This paper covers the most important data mining approaches: classification, clustering, association analysis, time series analysis, and outlier analysis. Additionally, a survey of recent work in this direction is included. Other significant challenges in the field are collecting, storing, and managing the large number of devices along with their associated features. The paper takes a deep look at data mining for IoT platforms, concentrating on real applications found in the literature.

    WEB-BASED DUPLICATE RECORDS DETECTION WITH ARABIC LANGUAGE ENHANCEMENT

    Sharing data between organizations is of growing importance in many data mining projects. Data from various heterogeneous sources often has to be linked and aggregated in order to improve data quality. The importance of data accuracy and quality has increased with the explosion of data size. The first step in ensuring data accuracy is to make sure that each real-world object is represented once and only once in a given dataset, a task called Duplicate Record Detection (DRD). Data inaccuracy problems arise from several factors, including spelling, typographical, and pronunciation variation, dialects, special vowel and consonant distinctions, and other linguistic characteristics, especially in non-Latin languages such as Arabic. In this paper, an English/Arabic-enabled web-based framework is designed and implemented; it treats user interaction to add new rules, enrich the dictionary, and evaluate results as an important step in improving the system's behavior. The proposed framework supports processing of both single-language and bilingual datasets. It is implemented and verified empirically in several case studies. The comparison results showed that the proposed system offers substantial improvements over known tools.
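
    As a rough illustration of the duplicate record detection idea described in this abstract, the sketch below flags pairs of name strings whose similarity exceeds a threshold. It is not the framework from the paper: it uses the generic sequence matcher from Python's standard `difflib` module instead of language-specific rules or a dictionary, and the sample records, the `normalize`, `similarity`, and `find_duplicates` helper names, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase the string and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicates(records, threshold=0.85):
    """Return every pair of records whose similarity reaches the threshold.

    The threshold is a hypothetical value; a real system would tune it.
    """
    return [
        (a, b)
        for a, b in combinations(records, 2)
        if similarity(a, b) >= threshold
    ]


# Hypothetical sample data: three spellings of one person plus one other person.
records = ["Mohammed Ali", "Mohamed  Ali", "Sara Hassan", "mohammed ali"]
pairs = find_duplicates(records)
for a, b in pairs:
    print(f"possible duplicate: {a!r} ~ {b!r}")
```

    Comparing all pairs is quadratic in the number of records, so a sketch like this only suits small datasets; larger ones usually need some form of blocking or indexing first.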

    TCE at Qur'an QA 2022: Arabic Language Question Answering Over Holy Qur'an Using a Post-Processed Ensemble of BERT-based Models

    In recent years, we have witnessed great progress in different tasks of natural language understanding using machine learning. Question answering is one of these tasks, and it is used by search engines and social media platforms to improve user experience. Arabic is the language of the Holy Qur'an, the sacred text for 1.8 billion people across the world. Arabic is a challenging language for Natural Language Processing (NLP) due to its complex structures. In this article, we describe our attempts at the OSACT5 Qur'an QA 2022 Shared Task, which is a question answering challenge on the Holy Qur'an in Arabic. We propose an ensemble learning model based on Arabic variants of BERT models. In addition, we perform post-processing to enhance the model predictions. Our system achieves a Partial Reciprocal Rank (pRR) score of 56.6% on the official test set. Comment: OSACT5 workshop, Qur'an QA 2022 Shared Task participation by TC

    OFCOD: On the Fly Clustering Based Outlier Detection Framework

    In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively, in a timely manner and on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.
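
    To make the general notion of clustering-based outlier detection concrete, here is a minimal sketch. It is not the OFCOD framework itself, whose internals this abstract does not describe. The sketch runs a plain k-means loop and then flags points that lie farther than a threshold from their nearest centroid. The sample points, the starting centroids, the threshold of 2.0, and the function names are all assumptions made for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means; `centroids` are caller-supplied starting guesses."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the index of its nearest centroid.
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def find_outliers(points, centroids, threshold):
    """Flag points whose distance to the nearest centroid exceeds `threshold`."""
    return [
        p for p in points
        if min(math.dist(p, c) for c in centroids) > threshold
    ]


# Hypothetical data: two tight groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (9.0, 0.0),
]
centroids = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
outliers = find_outliers(points, centroids, threshold=2.0)
print(outliers)
```

    Note that the isolated point still pulls its cluster's centroid toward itself, which is one reason practical methods tend to be more elaborate than a fixed distance cutoff.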

    ANFIS-based PID continuous sliding mode controller for robot manipulators in joint space

    This paper presents a feasible design for a control algorithm to synthesize an adaptive neuro-fuzzy inference system-based PID continuous sliding mode control system (ANFIS-PIDCSMC) for adaptive trajectory tracking control of rigid robot manipulators (RRMs) in the joint space. First, a PID sliding mode control algorithm with sliding surface dynamics-based continuous proportional-integral (PI) control action (PIDSMC-SSDCPI) is presented. The global stability conditions are formulated in terms of a Lyapunov full quadratic form such that the robot system output can track the desired reference output. Second, to increase the robustness of the control system, the PI control action in the PIDSMC-SSDCPI controller is replaced by an ANFIS control signal, yielding the ANFIS-PIDCSMC control approach. For the proposed control algorithm, numerical simulations using the dynamic model of an RRM with uncertainties and external disturbances show the high quality and effectiveness of the adopted control approach in high-speed trajectory tracking control problems. The simulation results, compared with those obtained for traditional controllers (a standalone PID and a traditional sliding mode controller (TSMC)), show that the robot system achieves acceptable tracking performance.

    Classification of Brain MRI Tumor Images Based on Deep Learning PGGAN Augmentation

    The wide prevalence of brain tumors in all age groups makes it necessary to identify the tumor type early and accurately and thus select the most appropriate treatment plan. The application of convolutional neural networks (CNNs) has helped radiologists to classify the type of brain tumor from magnetic resonance images (MRIs) more accurately. CNN training suffers from overfitting if an insufficient number of MRIs is introduced to the system. Recognized as the current best solution to this problem, the augmentation method allows for the optimization of the learning stage and thus maximizes the overall efficiency. The main objective of this study is to examine the efficacy of a new approach to the classification of brain tumor MRIs through the use of a VGG19 feature extractor coupled with one of three types of classifiers. A progressive growing generative adversarial network (PGGAN) augmentation model is used to produce ‘realistic’ MRIs of brain tumors and help overcome the shortage of images needed for deep learning. Results indicated the ability of our framework to classify gliomas, meningiomas, and pituitary tumors more accurately than in previous studies, with an accuracy of 98.54%. Other performance metrics were also examined.

    A Reliable Event-Driven Strategy for Real-Time Multiple Object Tracking Using Static Cameras

    Because of its importance in computer vision and surveillance systems, object tracking has progressed rapidly over the last two decades. Research on such systems still faces several theoretical and technical problems that badly affect not only the accuracy of position measurements but also the continuity of tracking. In this paper, a novel strategy for tracking multiple objects using static cameras is introduced, which can be used to provide a cheap, easy-to-install, and robust tracking system. The proposed tracking strategy is based on scenes captured by a number of static video cameras. Each camera is attached to a workstation that analyzes its stream. All workstations are connected directly to the tracking server, which harmonizes the system, collects the data, and creates the output spatio-temporal database. Our contribution is twofold. The first part is a new methodology for transforming the image coordinates of an object to its real coordinates. The second is a flexible event-based object tracking strategy. The proposed tracking strategy has been tested on a CAD model of a soccer game environment. Preliminary experimental results show the robust performance of the proposed tracking strategy.

    Deep-Risk: Deep Learning-Based Mortality Risk Predictive Models for COVID-19

    The SARS-CoV-2 virus has spread around the world, claiming many lives and causing widespread panic. Since COVID-19 is highly contagious and spreads quickly, an early diagnosis is essential. Identifying the mortality risk factors of COVID-19 patients is essential for reducing this risk among infected individuals. For the timely examination of large datasets, new computing approaches must be created. Many machine learning (ML) techniques have been developed to predict the mortality risk factors and severity for COVID-19 patients. Contrary to expectations, deep learning approaches as well as ML algorithms have not been widely applied in predicting the mortality and severity from COVID-19. Furthermore, the accuracy achieved by ML algorithms is lower than anticipated. In this work, three supervised deep learning predictive models are used to predict the mortality risk and severity for COVID-19 patients. The first one, which we refer to as CV-CNN, is built using a convolutional neural network (CNN); it is trained using a clinical dataset of 12,020 patients and is based on the 10-fold cross-validation (CV) approach for training and validation. The second predictive model, which we refer to as CV-LSTM + CNN, is developed by combining the long short-term memory (LSTM) approach with a CNN model. It is also trained using the clinical dataset based on the 10-fold CV approach for training and validation. The first two predictive models use the clinical dataset in its original CSV form. The last one, which we refer to as IMG-CNN, is a CNN model trained instead on images converted from the clinical dataset, where each image corresponds to a data row from the original clinical dataset. The experimental results revealed that the IMG-CNN predictive model outperforms the other two with an average accuracy of 94.14%, a precision of 100%, a recall of 91.0%, a specificity of 100%, an F1-score of 95.3%, an AUC of 93.6%, and a loss of 0.22.
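
    The metrics listed at the end of this abstract follow standard confusion-matrix definitions, and the short sketch below shows how they relate. Plugging the reported precision (100%) and recall (91.0%) into the usual F1 formula reproduces the reported F1-score of 95.3%. The confusion-matrix counts in the second part are hypothetical values picked only to give the same precision and recall; they are not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero; no guarding is done here.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1_score(precision, recall),
    }


# Check the reported figures: precision 100%, recall 91.0%.
reported_f1 = f1_score(1.0, 0.91)
print(f"F1 from reported precision and recall: {reported_f1:.1%}")

# Hypothetical counts (NOT from the paper) with zero false positives.
example = metrics_from_counts(tp=91, fp=0, tn=50, fn=9)
print(example)
```

    With zero false positives, precision and specificity are both 100%, which matches the pattern of the reported results.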

    QoS optimization for cloud service composition based on economic model

    Cloud service composition is usually long-term based and economically driven. Services in cloud computing can be categorized into two groups: application services and computing services. Compositions at the application level are similar to Web service compositions in Service-Oriented Computing. Compositions at the computing level are similar to task matching and scheduling in grid computing. We consider cloud service composition from the end user's perspective. We propose a Genetic Algorithm-based approach to model the cloud service composition problem. A comparison is given between the proposed composition approach and other existing algorithms such as Integer Linear Programming. The experimental results demonstrated the efficiency of the proposed approach. Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2015. This work was made possible by NPRP grant # 7 - 481-1 - 088 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.